Scaling Nonparametric Bayesian Inference via Subsample-Annealing
Authors
Abstract
We describe an adaptation of the simulated annealing algorithm to nonparametric clustering and related probabilistic models. This new algorithm learns nonparametric latent structure over a growing and constantly churning subsample of training data, where the portion of data subsampled can be interpreted as the inverse temperature β(t) in an annealing schedule. Gibbs sampling at high temperature (i.e., with a very small subsample) can more quickly explore sketches of the final latent state by (a) making longer jumps around latent space (as in block Gibbs) and (b) lowering energy barriers (as in simulated annealing). We prove subsample annealing speeds up mixing time N² → N in a simple clustering model and exp(N) → N in another class of models, where N is data size. Empirically, subsample annealing outperforms naive Gibbs sampling in accuracy-per-wallclock time, and can scale to larger datasets and deeper hierarchical models. We demonstrate improved inference on million-row subsamples of US Census data and network log data and a 307-row hospital rating dataset, using a Pitman-Yor generalization of the Cross Categorization model.
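The schedule described above, in which β(t) is the fraction of data currently in play, can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: the function name, the linear schedule, and the single-swap churn rule are all assumptions made for the example, and the Gibbs sweep itself is left as a placeholder.

```python
import random

def subsample_annealing_schedule(n_rows, n_steps, churn=1, seed=0):
    """Sketch of a subsample-annealing schedule (illustrative only).

    The inverse temperature beta(t) = t / n_steps is read as the fraction
    of data in the subsample: the subsample grows linearly from empty to
    the full dataset, while `churn` random swaps per step keep the
    membership mixing, as in the growing-and-churning scheme above.
    """
    rng = random.Random(seed)
    rows = list(range(n_rows))
    rng.shuffle(rows)
    in_sample, held_out = [], rows[:]
    for t in range(1, n_steps + 1):
        target = round(n_rows * t / n_steps)      # beta(t) * N rows in play
        while len(in_sample) < target and held_out:
            in_sample.append(held_out.pop())      # ADD: grow the subsample
        for _ in range(churn):                    # CHURN: swap one row out, one in
            if in_sample and held_out:
                i = rng.randrange(len(in_sample))
                held_out.append(in_sample[i])
                in_sample[i] = held_out.pop(rng.randrange(len(held_out)))
        yield list(in_sample)                     # caller runs a Gibbs sweep here

sizes = [len(s) for s in subsample_annealing_schedule(1000, 10)]
print(sizes)  # subsample sizes grow monotonically toward the full 1000 rows
```

Early iterations operate on tiny subsamples (high temperature), so each Gibbs sweep is cheap and latent-state moves are large; later iterations refine the state on nearly the full dataset.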
Similar Resources
Bayesian Nonparametric and Parametric Inference
This paper reviews Bayesian Nonparametric methods and discusses how parametric predictive densities can be constructed using nonparametric ideas.
Large Scale Nonparametric Bayesian Inference: Data Parallelisation in the Indian Buffet Process
Nonparametric Bayesian models provide a framework for flexible probabilistic modelling of complex datasets. Unfortunately, the high-dimensional averages required for Bayesian methods can be slow, especially with the unbounded representations used by nonparametric models. We address the challenge of scaling Bayesian inference to the increasingly large datasets found in real-world applications. W...
Empirical Likelihood Based Posterior Expectation: from nonparametric posterior means via double empirical Bayesian estimators to nonparametric versions of the James-Stein estimator
Posterior expectation is a well-accepted method for data analysis via Bayesian inference based on parametric likelihoods. In this paper we propose utilizing empirical likelihood (EL) methodology to develop novel nonparametric posterior expectation. The parametric Bayesian methodology contains the empirical Bayes approach for the purpose of using the observed data to estimate parameters, or even...
Bayesian time series models and scalable inference
With large and growing datasets and complex models, there is an increasing need for scalable Bayesian inference. We describe two lines of work to address this need. In the first part, we develop new algorithms for inference in hierarchical Bayesian time series models based on the hidden Markov model (HMM), hidden semi-Markov model (HSMM), and their Bayesian nonparametric extensions. The HMM is ...
Coarse-to-fine MCMC in a seismic monitoring system
We apply coarse-to-fine MCMC to perform Bayesian inference for a seismic monitoring system. While traditional MCMC has difficulty moving between local optima, by applying coarse-to-fine MCMC, we can adjust the resolution of the model and this allows the state to jump between different optima more easily. It is quite similar to simulated annealing. We will use a 1D model as an example, and then ...
Publication date: 2014